
    Developing a manually annotated clinical document corpus to identify phenotypic information for inflammatory bowel disease

    Background: Natural Language Processing (NLP) systems can be used for specific Information Extraction (IE) tasks such as extracting phenotypic data from the electronic medical record (EMR). These data are useful for translational research and are often found only in free-text clinical notes. A key required step for IE is the manual annotation of clinical corpora and the creation of a reference standard to (1) support training and validation tasks and (2) focus and clarify NLP system requirements. These tasks are time-consuming, expensive, and require considerable effort on the part of human reviewers. Methods: Using a set of clinical documents from the VA EMR for a particular use case of interest, we identify specific challenges and present several opportunities for annotation tasks. We demonstrate specific methods using an open source annotation tool, a customized annotation schema, and a corpus of clinical documents for patients known to have a diagnosis of Inflammatory Bowel Disease (IBD). We report clinician annotator agreement at the document, concept, and concept attribute level. We estimate concept yield in terms of annotated concepts within specific note sections and document types. Results: Annotator agreement at the document level for documents that contained concepts of interest for IBD, using the estimated Kappa statistic (95% CI), was very high at 0.87 (0.82, 0.93). At the concept level, F-measure ranged from 0.61 to 0.83. However, agreement varied greatly at the specific concept attribute level. For this particular use case (IBD), clinical documents producing the highest concept yield per document included GI clinic notes and primary care notes. Within the various types of notes, the highest concept yield was in sections representing patient assessment and history of presenting illness. Ancillary service documents and family history and plan note sections produced the lowest concept yield. Conclusion: Challenges include defining and building appropriate annotation schemas, adequately training clinician annotators, and determining the appropriate level of information to be annotated. Opportunities include narrowing the focus of information extraction to use-case-specific note types and sections, especially in cases where NLP systems will be used to extract information from large repositories of electronic clinical note documents.
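The two agreement measures named in this abstract can be illustrated with a minimal Python sketch. It assumes two annotators, document-level labels of 1 (contains a concept of interest) or 0, and concept spans compared by exact match. The function names and toy data are hypothetical, and the abstract does not say how the authors estimated Kappa or its confidence interval, so this is not their procedure.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


def pairwise_f_measure(spans_a, spans_b):
    """F-measure with annotator A as reference, exact span matching."""
    set_a, set_b = set(spans_a), set(spans_b)
    matched = len(set_a & set_b)
    if matched == 0:
        return 0.0
    precision = matched / len(set_b)
    recall = matched / len(set_a)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data, not taken from the study.
doc_labels_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
doc_labels_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
spans_a = [(0, 5, "diagnosis"), (10, 18, "symptom"), (30, 42, "medication")]
spans_b = [(0, 5, "diagnosis"), (10, 18, "symptom"), (50, 55, "medication")]

kappa = cohen_kappa(doc_labels_a, doc_labels_b)
f_measure = pairwise_f_measure(spans_a, spans_b)
print(f"kappa={kappa:.2f}, F-measure={f_measure:.2f}")
```

Kappa suits the document level, where every document receives a label from both annotators. F-measure is the usual choice at the concept level, where there is no fixed set of items for both annotators to label.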

    Influenza Outbreak during Sydney World Youth Day 2008: The Utility of Laboratory Testing and Case Definitions on Mass Gathering Outbreak Containment

    BACKGROUND: Influenza causes annual epidemics and often results in extensive outbreaks in closed communities. To minimize transmission, a range of interventions have been suggested. For these to be effective, an accurate and timely diagnosis of influenza is required. This is confirmed by a positive laboratory test result in an individual whose symptoms are consistent with a predefined clinical case definition. However, the utility of these clinical case definitions and laboratory testing in mass gathering outbreaks remains unknown. METHODS AND RESULTS: An influenza outbreak was identified during World Youth Day 2008 in Sydney. From the data collected on pilgrims presenting to a single clinic, a Markov model was developed and validated against the actual epidemic curve. Simulations were performed to examine the utility of different clinical case definitions and laboratory testing strategies for containment of influenza outbreaks. Clinical case definitions were found to have the greatest impact on averting further cases, with no added benefit when combined with any laboratory test. Although nucleic acid testing (NAT) demonstrated higher utility than indirect immunofluorescence antigen or on-site point-of-care testing, this effect was lost when laboratory NAT turnaround time was included. The main benefit of laboratory confirmation was limited to identification of true influenza cases amenable to interventions such as antiviral therapy. CONCLUSIONS: Continuous re-evaluation of case definitions and laboratory testing strategies is essential for effective management of influenza outbreaks during mass gatherings.

    Combining Free Text and Structured Electronic Medical Record Entries to Detect Acute Respiratory Infections

    The electronic medical record (EMR) contains a rich source of information that could be harnessed for epidemic surveillance. We asked if structured EMR data could be coupled with computerized processing of free-text clinical entries to enhance detection of acute respiratory infections (ARI). A manual review of EMR records related to 15,377 outpatient visits uncovered 280 reference cases of ARI. We used logistic regression with backward elimination to determine which among candidate structured EMR parameters (diagnostic codes, vital signs and orders for tests, imaging and medications) contributed to the detection of those reference cases. We also developed a computerized free-text search to identify clinical notes documenting at least two non-negated ARI symptoms. We then used heuristics to build case-detection algorithms that best combined the retained structured EMR parameters with the results of the text analysis. An adjusted grouping of diagnostic codes identified reference ARI patients with a sensitivity of 79%, a specificity of 96% and a positive predictive value (PPV) of 32%. Of the 21 additional structured clinical parameters considered, two contributed significantly to ARI detection: new prescriptions for cough remedies and elevations in body temperature to at least 38°C. Together with the diagnostic codes, these parameters increased detection sensitivity to 87%, but specificity and PPV declined to 95% and 25%, respectively. Adding text analysis increased sensitivity to 99%, but PPV dropped further to 14%. Algorithms that required satisfying both a query of structured EMR parameters as well as text analysis disclosed PPVs of 52-68% and retained sensitivities of 69-73%. Structured EMR parameters and free-text analyses can be combined into algorithms that can detect ARI cases with new levels of sensitivity or precision. These results highlight potential paths by which repurposed EMR information could facilitate the discovery of epidemics before they cause mass casualties.
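The trade-off reported in this abstract can be illustrated with a minimal Python sketch: accepting a case when either signal fires tends to raise sensitivity, while requiring both tends to raise PPV. The function names and the ten-visit toy data are hypothetical and do not reproduce the study's algorithms or figures.

```python
def detection_metrics(predicted, reference):
    """Sensitivity, specificity and PPV of binary flags against reference cases."""
    pairs = [(bool(p), bool(r)) for p, r in zip(predicted, reference)]
    tp = sum(p and r for p, r in pairs)
    fp = sum(p and not r for p, r in pairs)
    fn = sum(not p and r for p, r in pairs)
    tn = sum(not p and not r for p, r in pairs)

    def ratio(num, den):
        # None marks an undefined metric (empty denominator).
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Hypothetical toy data: one entry per outpatient visit.
reference = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # manual chart review
structured = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]  # e.g. diagnostic-code query
text_hits = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]   # e.g. free-text symptom search

# "Either" accepts a visit flagged by any signal; "both" requires agreement.
either = [s or t for s, t in zip(structured, text_hits)]
both = [s and t for s, t in zip(structured, text_hits)]

metrics_structured = detection_metrics(structured, reference)
metrics_either = detection_metrics(either, reference)
metrics_both = detection_metrics(both, reference)

for name, m in [("structured", metrics_structured),
                ("either", metrics_either),
                ("both", metrics_both)]:
    print(name, m)
```

In this toy data the "either" rule lifts sensitivity above the structured query alone at some cost in PPV, and the "both" rule does the reverse. The abstract reports the same direction of change for its own algorithms.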

    Technological Trends in the Sport Field: Which Application Areas and Challenges?

    This paper investigates the application of new technologies in the sport field. Technology, mainly information technology (IT) and the internet, is deeply changing the overall picture of the sport sector. New technologies facilitate knowledge transfer in the management of sporting events such as the Olympic Games; at the same time, innovative techniques can significantly affect athletes’ performance and the social integration of disabled persons. There is an explosion of technology applications in the sport field in different sub-organizational areas, but this phenomenon is still underrepresented in the literature. This paper aims to identify and highlight the main application areas and challenges faced by technology in the sport setting. This study, through a review of the literature, represents a research starting point that allows us to systematize and clarify the main contributions on this topic and to identify new research perspectives.

    Epidemic Surveillance Using an Electronic Medical Record: An Empiric Approach to Performance Improvement

    Background: Electronic medical records (EMR) form a rich repository of information that could benefit public health. We asked how structured and free-text narrative EMR data should be combined to improve epidemic surveillance for acute respiratory infections (ARI). Methods: Eight previously characterized ARI case detection algorithms (CDA) were applied to historical EMR entries to create authentic time series of daily ARI case counts (background). An epidemic model simulated influenza cases (injection). From the time of the injection, cluster-detection statistics were applied daily on paired background+injection (combined) and background-only time series. This cycle was then repeated with the injection shifted to each week of the evaluation year. We computed: a) the time from injection to the first statistical alarm uniquely found in the combined dataset (Detection Delay); b) how often alarms originated in the background-only dataset (false-alarm rate, or FAR); and c) the number of cases found within these false alarms (Caseload). For each CDA, we plotted the Detection Delay as a function of FAR or Caseload, over a broad range of alarm thresholds. Results: CDAs that combined text analyses seeking ARI symptoms in clinical notes with provider-assigned diagnostic codes, in order to maximize the precision rather than the sensitivity of case detection, lowered Detection Delay at any given FAR or Caseload. Conclusion: An empiric approach can guide the integration of EMR data into case-detection methods that improve both the timeliness and efficiency of epidemic detection.
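The Detection Delay and FAR definitions in this abstract can be made concrete with a minimal Python sketch. It swaps in a plain count threshold for the cluster-detection statistics, which the abstract does not specify, and it takes FAR to be the fraction of days with a background-only alarm. Both choices, along with the function names and the toy series, are assumptions for illustration only.

```python
def alarm_days(series, threshold):
    """Days on which the daily count exceeds the alarm threshold."""
    return {day for day, count in enumerate(series) if count > threshold}


def detection_delay(background, injection, start_day, threshold):
    """Days from injection start to the first alarm seen only in the combined series."""
    combined = [b + i for b, i in zip(background, injection)]
    background_alarms = alarm_days(background, threshold)
    unique_alarms = sorted(
        day for day in alarm_days(combined, threshold)
        if day >= start_day and day not in background_alarms
    )
    # None means the injected outbreak was never uniquely detected.
    return unique_alarms[0] - start_day if unique_alarms else None


def false_alarm_rate(background, threshold):
    """Fraction of days with an alarm in the background-only series."""
    return len(alarm_days(background, threshold)) / len(background)


# Hypothetical toy data: daily ARI counts and simulated outbreak cases.
background = [10, 12, 11, 15, 10, 9, 13, 11, 10, 12]
injection = [0, 0, 0, 0, 1, 2, 4, 7, 9, 6]
start_day = 4

# Sweep thresholds to trace Detection Delay against FAR.
curve = {
    threshold: (
        detection_delay(background, injection, start_day, threshold),
        false_alarm_rate(background, threshold),
    )
    for threshold in (12, 14, 16)
}
for threshold, (delay, far) in curve.items():
    print(f"threshold={threshold}: delay={delay} days, FAR={far:.1f}")
```

Sweeping the threshold yields the pairs of Detection Delay and FAR that the abstract describes plotting for each CDA. A real evaluation would also repeat the injection across the year, as the Methods describe.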